The Mystery of Two Straight Lines in Bacterial Genome Statistics. Release 2007
Abstract
In special coordinates (codon position-specific nucleotide frequencies), bacterial genomes form two straight lines in 9-dimensional space: one line for eubacterial genomes, another for archaeal genomes. All 348 distinct bacterial genomes available in GenBank in April 2007 belong to these lines with high accuracy. The main challenge now is to explain this high accuracy. A new phenomenon, complementary symmetry of codon position-specific nucleotide frequencies, is observed. The results of an analysis of several codon usage models are presented. We demonstrate that the mean-field approximation, also known as the context-free model, the complete independence model, or the Segre variety, can serve as a reasonable approximation to real codon usage. The first two principal components of codon usage correlate strongly with genomic G+C content and optimal growth temperature, respectively. The variation of codon usage along the third component is related to the curvature of the mean-field approximation. The first three eigenvalues in the codon usage PCA explain 59.1%, 7.8% and 4.7% of the variation, respectively. Codon usage in eubacterial and archaeal genomes is clearly distributed along two third-order curves, with genomic G+C content as a parameter.
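As a rough illustration of the two quantities named in the abstract, the sketch below computes codon position-specific nucleotide frequencies from a table of codon counts and then builds the mean-field (complete independence) approximation, in which each codon frequency is taken as the product of its three position-specific frequencies. The function names and the toy counts are assumptions made for this example and do not come from the paper. The twelve frequencies (four nucleotides at three positions) satisfy three normalization constraints, which leaves nine independent coordinates, consistent with the 9-dimensional space mentioned above.

```python
from itertools import product

NUCLEOTIDES = "ACGT"


def position_frequencies(codon_counts):
    """Return freqs[k][n]: frequency of nucleotide n at codon position k (k = 0, 1, 2)."""
    total = sum(codon_counts.values())
    freqs = [{n: 0.0 for n in NUCLEOTIDES} for _ in range(3)]
    for codon, count in codon_counts.items():
        for k, n in enumerate(codon):
            freqs[k][n] += count / total
    return freqs


def mean_field_codon_usage(freqs):
    """Approximate every codon frequency as the product of its three
    position-specific nucleotide frequencies (positions treated as independent)."""
    return {
        "".join(c): freqs[0][c[0]] * freqs[1][c[1]] * freqs[2][c[2]]
        for c in product(NUCLEOTIDES, repeat=3)
    }


# Toy codon counts, invented for illustration only (not data from the paper).
toy_counts = {"ATG": 4, "GCC": 3, "GAA": 2, "TTT": 1}

p = position_frequencies(toy_counts)
approx = mean_field_codon_usage(p)

for k in range(3):
    print(f"position {k + 1}:", {n: round(v, 3) for n, v in p[k].items()})
print("mean-field frequency of ATG:", round(approx["ATG"], 4))
```

Comparing `approx` with the observed codon frequencies of a real genome would be one way to gauge how well the independence assumption holds for that genome; the sketch does not attempt that comparison.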